This listing of claims will replace all prior versions, and listings, of claims in the application.
Listing of Claims: 

1. (original) A method for automatically classifying consonance of audio data, comprising: 
applying audio data to a peak detection process; 

detecting the location of at least one prominent peak represented by the audio data in the 
frequency spectrum and determining the energy of the at least one prominent peak; 

storing the location of the at least one prominent peak and the energy of the at least one 
prominent peak into at least one output matrix; 

applying the data stored in said at least one output matrix to critical band masking 
filtering; 

applying the data stored in said at least one output matrix to a peak continuation process; 

and 

applying the data stored in said at least one output matrix to an intervals calculation 
process where the frequency of ratios between peaks are stored into an output vector for the 
audio data being classified. 

2. (original) A method according to claim 1, wherein the audio data is divided into frames, 
and the method is performed frame by frame. 

3. (original) A method according to claim 2, wherein the frame by frame approach includes 
bin differencing to calculate frame derivatives to facilitate the detection of peaks. 

4. (original) A method according to claim 2, wherein the number of peaks detected in said
application of the peak detection process is limited by a pre-defined parameter. 
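
For illustration only, the bin differencing of claim 3 and the peak limit of claim 4 might be sketched as follows. The function name detect_peaks and the parameter max_peaks are assumptions made for this sketch and do not appear in the claims.

```python
def detect_peaks(magnitudes, max_peaks=None):
    """Return the bin indices of local maxima in one frame's magnitude spectrum."""
    # Difference between adjacent bins (the frame "derivative").
    diffs = [b - a for a, b in zip(magnitudes, magnitudes[1:])]
    peaks = []
    for k in range(1, len(magnitudes) - 1):
        # A peak sits where the difference changes from positive to non-positive.
        if diffs[k - 1] > 0 and diffs[k] <= 0:
            peaks.append(k)
    # Rank by magnitude so that the optional cap keeps the most prominent peaks.
    peaks.sort(key=lambda k: magnitudes[k], reverse=True)
    if max_peaks is not None:
        peaks = peaks[:max_peaks]
    return sorted(peaks)


frame = [0, 1, 3, 1, 0, 2, 5, 2, 0]
all_peaks = detect_peaks(frame)
strongest = detect_peaks(frame, max_peaks=1)
```

In this sketch the cap is applied after ranking by magnitude, which is one plausible reading of "limited by a pre-defined parameter".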

5. (original) A method according to claim 1, further comprising performing Nth order 
interpolation on the location of the at least one prominent peak and the energy of the at least one 
prominent peak to increase precision of the location and energy values for the peak. 
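
As a hedged example of the interpolation of claim 5, the sketch below assumes N = 2 (parabolic interpolation through the peak bin and its two neighbors). The claim covers any order N; the function name refine_peak and the use of bin magnitude as the energy value are assumptions of this sketch.

```python
def refine_peak(magnitudes, k):
    """Refine the location and height of a peak at bin k with a parabola fit."""
    left, mid, right = magnitudes[k - 1], magnitudes[k], magnitudes[k + 1]
    denom = left - 2.0 * mid + right
    if denom == 0:
        # The three points are collinear, so there is nothing to refine.
        return float(k), float(mid)
    # Offset of the parabola's vertex from bin k, in fractions of a bin.
    offset = 0.5 * (left - right) / denom
    location = k + offset
    # Height of the parabola at its vertex.
    energy = mid - 0.25 * (left - right) * offset
    return location, energy


# Samples of the parabola y = 10 - (x - 2.25)**2 at x = 1, 2, 3.
location, energy = refine_peak([0.0, 8.4375, 9.9375, 9.4375, 0.0], 2)
```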

6. (original) A method according to claim 1, further comprising applying the output vector 
to a classification stage which determines at least one of (1) at least one consonance value and 
(2) at least one consonance class that describes the audio data. 

7. (original) A method according to claim 1, where the frequency of ratios between peaks 
are stored into an output vector that is 1 x 24. 
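
The claims do not state how ratios are assigned to the 24 entries of the output vector. The sketch below assumes, purely for illustration, that each pairwise ratio is folded into a single octave and quantized to 24 equal logarithmic steps; the name interval_vector and this binning rule are hypothetical.

```python
import math
from itertools import combinations


def interval_vector(peak_freqs, bins=24):
    """Count how often each folded frequency ratio occurs among peak pairs."""
    counts = [0] * bins
    for low, high in combinations(sorted(peak_freqs), 2):
        ratio = high / low
        # Fold the ratio into one octave: position in [0, 1) on a log2 scale.
        octave_pos = math.log2(ratio) % 1.0
        # Quantize to the nearest of `bins` equal steps, wrapping at the octave.
        index = int(round(octave_pos * bins)) % bins
        counts[index] += 1
    return counts


vector = interval_vector([220.0, 330.0, 440.0])
```

With peaks at 220, 330, and 440 Hz, the three pairs give ratios of 3:2, 2:1, and 4:3, each counted once in its own entry of the 1 x 24 vector.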

8. (original) A method according to claim 2, wherein the peak continuation process keeps 
track of peaks that last more than a predetermined number of frames.

9. (original) A method according to claim 8, wherein the peak continuation process fills in 
a peak when the peak is missed in a previous frame. 
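
One way the peak continuation of claims 8 and 9 could be sketched is shown below. The matching tolerance, the one-frame gap limit, the track dictionary layout, and the name continue_peaks are all assumptions made for this illustration.

```python
def continue_peaks(frames, tolerance=1.0, min_frames=2):
    """Link per-frame peak locations into tracks, bridging one-frame gaps."""
    active, finished = [], []
    for t, peaks in enumerate(frames):
        unused = list(peaks)
        still_active = []
        for track in active:
            last = track["locs"][-1]
            match = min(unused, key=lambda p: abs(p - last), default=None)
            if match is not None and abs(match - last) <= tolerance:
                unused.remove(match)
                track["locs"].append(match)
                track["gap"] = False
                still_active.append(track)
            elif not track["gap"]:
                # Peak missed in this frame: fill it in tentatively.
                track["locs"].append(last)
                track["gap"] = True
                still_active.append(track)
            else:
                # Missed twice in a row: drop the tentative fill and close.
                track["locs"].pop()
                finished.append(track)
        for p in unused:
            still_active.append({"start": t, "locs": [p], "gap": False})
        active = still_active
    for track in active:
        if track["gap"]:
            track["locs"].pop()
        finished.append(track)
    # Keep only tracks lasting more than the predetermined number of frames.
    return [tr for tr in finished if len(tr["locs"]) > min_frames]


tracks = continue_peaks([[100.0, 300.0], [100.4], [], [100.8], [101.0]])
```

Here the peak near 100 is missed in the third frame and filled in, while the isolated peak at 300 is discarded because it does not persist.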

10. (original) A method according to claim 1, wherein said critical band masking filtering
removes a peak that is masked by surrounding peaks with more energy.

11. (original) A method according to claim 10, wherein said critical band masking filtering
removes a peak when at least one of a lower frequency peak and a higher frequency peak have 
greater energy. 

12. (original) A method according to claim 10, wherein said critical band masking filters are 
scalable so that the amount of masking is scalable. 
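
The masking of claims 10 to 12 might be approximated as in the following sketch. It assumes a fixed bandwidth in Hz in place of true critical bands, and a single scale factor controlling how much stronger a neighbor must be; mask_peaks, bandwidth, and scale are hypothetical names.

```python
def mask_peaks(peaks, bandwidth=100.0, scale=1.0):
    """Drop peaks that a stronger neighbor within `bandwidth` Hz would mask.

    peaks is a list of (frequency, energy) pairs.
    """
    kept = []
    for i, (freq, energy) in enumerate(peaks):
        masked = any(
            j != i
            and abs(other_freq - freq) <= bandwidth
            and other_energy > scale * energy
            for j, (other_freq, other_energy) in enumerate(peaks)
        )
        if not masked:
            kept.append((freq, energy))
    return kept


peaks = [(440.0, 1.0), (470.0, 0.2), (900.0, 0.5)]
default_masking = mask_peaks(peaks)
light_masking = mask_peaks(peaks, scale=10.0)
```

Raising scale reduces the amount of masking, since a neighbor must then exceed the peak's energy by a larger factor before the peak is removed.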

13. (original) A method according to claim 1, wherein said storing includes providing an 
output of the peak detection and interpolation stage in two matrices, one holding the location of 
the at least one prominent peak, and the second holding the respective energy of the at least one 
prominent peak. 

14. (original) A method according to claim 1, wherein the audio data is formatted according 
to pulse code modulated format. 

15. (original) A method according to claim 14, wherein the audio data is previously in a 
format other than pulse code modulated format, and the method further comprises converting the 
audio data to pulse code modulated format from the other format. 

16. (original) The method of claim 1, further comprising converting the input audio data
from the time domain to the frequency domain. 

17. (original) A method according to claim 16, wherein said converting of the input audio 
data signal from the time domain to the frequency domain includes performing a fast fourier 
transform on the audio data. 
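
A minimal sketch of the time-to-frequency conversion of claims 16 and 17 follows, assuming a recursive radix-2 fast Fourier transform on a frame whose length is a power of two. The claims do not specify a particular FFT variant; this one is chosen only for brevity.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("frame length must be a power of two")
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Combine the half-size transforms with the twiddle factor.
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


# One frame holding exactly two cycles of a cosine.
frame = [math.cos(2 * math.pi * 2 * i / 8) for i in range(8)]
magnitudes = [abs(c) for c in fft(frame)]
```

The magnitude spectrum of this frame is concentrated in bin 2 and its mirror image, bin 6, which is the kind of input a peak detection process would then examine.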

18. (currently amended) A computer readable medium bearing computer executable
instructions for carrying out [[the method of claim 1.]]

applying audio data to a peak detection process; 

detecting the location of at least one prominent peak represented by the audio data in the 
frequency spectrum and determining the energy of the at least one prominent peak; 

storing the location of the at least one prominent peak and the energy of the at least one 
prominent peak into at least one output matrix; 

applying the data stored in said at least one output matrix to critical band masking 
filtering; 

applying the data stored in said at least one output matrix to a peak continuation process; 

and 

applying the data stored in said at least one output matrix to an intervals calculation 
process where the frequency of ratios between peaks are stored into an output vector for the 
audio data being classified. 

19. (currently amended) A modulated data signal carrying computer executable instructions
for [[performing the method of claim 1.]]

applying audio data to a peak detection process; 

detecting the location of at least one prominent peak represented by the audio data in the 
frequency spectrum and determining the energy of the at least one prominent peak; 

storing the location of the at least one prominent peak and the energy of the at least one 
prominent peak into at least one output matrix; 

applying the data stored in said at least one output matrix to critical band masking 
filtering; 

applying the data stored in said at least one output matrix to a peak continuation process; 

and 

applying the data stored in said at least one output matrix to an intervals calculation 
process where the frequency of ratios between peaks are stored into an output vector for the 
audio data being classified. 

20. (currently amended) At least one computing device comprising [[means]] one or more
subsystems for: [[performing the method of claim 1.]]

applying audio data to a peak detection process; 

detecting the location of at least one prominent peak represented by the audio data in the 
frequency spectrum and determining the energy of the at least one prominent peak; 

storing the location of the at least one prominent peak and the energy of the at least one 
prominent peak into at least one output matrix; 


DOCKET NO.: MSFT-0580/167506.02 PATENT 

Application No.: 09/900,059 

Office Action Dated: February 23, 2004 

applying the data stored in said at least one output matrix to critical band masking
filtering; 

applying the data stored in said at least one output matrix to a peak continuation process;

and 

applying the data stored in said at least one output matrix to an intervals calculation
process where the frequency of ratios between peaks are stored into an output vector for the 
audio data being classified. 

21. (original) A method of classifying data according to consonance properties of the data,
comprising: 

assigning to each media entity of a plurality of media entities in a data set to at least one 
consonance class; 

processing each media entity of said data set to extract at least one consonance 
characteristic based on digital signal processing of each media entity; 

generating a plurality of consonance vectors for said plurality of media entities, wherein 
each consonance vector includes said at least one consonance class and at least one consonance 
characteristic based on digital signal processing; and 

forming a classification chain based upon said plurality of feature vectors. 

22. (original) A method according to claim 21, further comprising: 

processing an unclassified media entity to extract at least one consonance characteristic 
based on digital signal processing of the unclassified media entity; 

generating a vector for the unclassified media entity including said at least one digital
signal processing consonance characteristic; 

presenting the vector for the unclassified media entity to the classification chain; and 
classifying the unclassified entry with an estimate of the consonance class by calculating 
the representative consonance class of the subset of the plurality of vectors of the classification 
chain located in the neighborhood of the vector for the unclassified entity. 

23. (original) A method according to claim 22, further including calculating a neighborhood 
distance that defines a distance within which two vectors in the classification chain space are in 
the same neighborhood for purposes of being in the same consonance class. 

24. (original) A method according to claim 22, wherein said classifying of the unclassified 
entry includes classifying the unclassified entry with a median consonance class represented by 
the neighborhood. 

25. (original) A method according to claim 22, wherein said consonance class is described 
by a numerical value and said classifying of the unclassified entry includes classifying the 
unclassified entry with a mean of numerical consonance values found in the neighborhood. 

26. (original) A method according to claim 22, wherein said classifying includes returning at 
least one number indicating the level of confidence of the consonance class estimate. 
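
The neighborhood classification of claims 22 to 26 could be sketched as below. The sketch assumes Euclidean distance, a numerical consonance class, and a confidence figure equal to the fraction of the chain falling inside the neighborhood; that confidence measure and the name classify are illustrative assumptions only.

```python
import math
from statistics import mean, median


def classify(chain, vector, radius, use_median=False):
    """Estimate a consonance value from chain entries within `radius`.

    chain is a list of (feature_vector, consonance_value) pairs.
    """
    neighbors = [
        value for vec, value in chain if math.dist(vec, vector) <= radius
    ]
    if not neighbors:
        # No classified entity is close enough to support an estimate.
        return None, 0.0
    estimate = median(neighbors) if use_median else mean(neighbors)
    # Assumed confidence measure: share of the chain in the neighborhood.
    confidence = len(neighbors) / len(chain)
    return estimate, confidence


chain = [((0.0, 0.0), 1.0), ((0.1, 0.0), 3.0), ((5.0, 5.0), 9.0)]
estimate, confidence = classify(chain, (0.0, 0.1), radius=1.0)
```

A low confidence value returned this way could flag an entity for further examination, in the spirit of claim 32.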

27. (currently amended) A computer readable medium bearing computer executable
instructions for carrying out [[the method of claim 21.]]

assigning to each media entity of a plurality of media entities in a data set to at least one 
consonance class; 

processing each media entity of said data set to extract at least one consonance 
characteristic based on digital signal processing of each media entity; 

generating a plurality of consonance vectors for said plurality of media entities, wherein 
each consonance vector includes said at least one consonance class and at least one consonance 
characteristic based on digital signal processing; and 

forming a classification chain based upon said plurality of feature vectors. 

28. (currently amended) A modulated data signal carrying computer executable instructions
for: [[performing the method of claim 21.]]

assigning to each media entity of a plurality of media entities in a data set to at least one 
consonance class; 

processing each media entity of said data set to extract at least one consonance 
characteristic based on digital signal processing of each media entity; 

generating a plurality of consonance vectors for said plurality of media entities, wherein 
each consonance vector includes said at least one consonance class and at least one consonance 
characteristic based on digital signal processing; and 

forming a classification chain based upon said plurality of feature vectors. 

29. (currently amended) At least one computing device comprising means for
[[performing the method of claim 21.]]

assigning to each media entity of a plurality of media entities in a data set to at least one 
consonance class; 

processing each media entity of said data set to extract at least one consonance 
characteristic based on digital signal processing of each media entity; 

generating a plurality of consonance vectors for said plurality of media entities, wherein 
each consonance vector includes said at least one consonance class and at least one consonance 
characteristic based on digital signal processing; and 

forming a classification chain based upon said plurality of feature vectors. 

30. (original) A computing system, comprising: 
a computing device including: 

a classification chain data structure stored thereon having a plurality of classification 
vectors, wherein each vector includes data representative of a consonance class as classified by 
humans and consonance characteristics as determined by digital signal processing; and 

processing means for comparing an unclassified media entity to the classification chain 
data structure to determine an estimate of the consonance class of the unclassified media entity. 

31. (original) A computing system according to claim 30, wherein said determining of an 
estimate of the consonance class includes returning at least one number indicating the level of 
confidence of the consonance class assignment. 

32. (original) A method according to claim 31, wherein the performance level of the
classification chain improves over time due to the examination of unclassified media entities that 
have a low confidence level associated with the consonance class assignment. 

33. (original) A classification chain data structure utilized in connection with the 
classification of consonance of new unclassified media entities, comprising: 

a plurality of classification vectors, wherein each vector includes: 
consonance data as classified by humans; and 
consonance data determined by digital signal processing techniques. 
